Psychological Concept of Subjective Probability:
A Measurement-Theoretic View



A point of view is presented concerning the psychological 
concept of subjective probability, both to study its relation 
to the corresponding mathematical and philosophical concepts
and to provide a framework for the rigorous investigation of 
problems unique to psychology. In order to do this the em- 
pirical implications of axiom systems for measurement are 
discussed first, relying primarily on Krantz's work, with 
special emphasis, however, on some similarities and differences 
between psychological and physical variables. The psycho- 
logical variable of uncertainty is then examined in this 
light, and it is concluded that few, if any, current theories 
are satisfactory when viewed from this perspective, particularly
those deriving from the mathematical work in the axiomatic 
foundations of probability. This might appear to pose diffi- 
culties for applications to real problems of normative decision 
theory when those applications require numerical probability 
judgments from individuals. Two possible solutions are dis- 
cussed briefly. 

The concept of subjective probability, or expectancy, has
been used variously in psychology, often in ways strongly
influenced by the mathematical or philosophical meanings of that
term. Psychologists interested in behavioral decision theory have
concentrated primarily on three related problems. One has been
that of how subjective probability combines with other variables,
especially utility, to determine decisions (see Rapoport and
Wallsten, 1972, for a review of recent literature). A second area
of research has centered around the process of subjective
probability, focusing on questions such as what independent
variables influence subjective probability, or how subjective
probability is revised with new information (see, e.g., Edwards,
1968; Rapoport and Wallsten, 1972; Tversky and Kahneman, 1972;
Wallsten, 1972; Wise, 1970). Finally, investigation by
psychologists and others has been directed to the experimental
measurement of subjective probability (see, e.g., Beach and
Phillips, 1967; Stael von Holstein, 1970; Winkler, 1967).

The term "subjective probability" is not used identically in
all cases, although it generally is taken to refer to some aspect
of an individual's (or a group's) uncertainty or expectation
concerning which subset of a set of events is going to occur, or
is going to occur most frequently, or is true under specific
conditions. We will attempt some clarifications of the
psychological concept in this paper, particularly to present a
point of view concerning the relation between certain
mathematical work and psychological research. It is suggested
that this might provide a framework within which those questions
of subjective probability unique to psychology may be formulated
and investigated.

The ideas in this paper revolve closely around questions of
measurement, and a considerable amount of space is devoted to its
empirical justification. This is particularly important, since so
much recent psychological work on subjective probability has
depended on measurement in one form or another, often treating
numbers emitted by subjects as measures of subjective probability
or odds. However, nothing said here should be taken to imply that
any theory of behavior under uncertainty must have measurement,
in the sense to be defined, as one of its ends. If this is not
one of its purposes, then clearly that theory need not worry
about its justification. But, in the absence of evidence or
reason to support metric assumptions about the data, the theory
should be qualitative, or ordinal, in nature, as it will be argued
should measurement-oriented formulations. An excellent and
important recent development in this spirit, which allows the
concept of expectancy to be used meaningfully with infrahumans
as well as humans, is by Irwin (1971).



The approach to be advocated in this paper has been suggested
before (Wallsten, 1970, 1971), but can be expanded and made
considerably more clear now in light of Krantz's (1972a,b)
analysis of measurement foundations as qualitative empirical
laws. First, we will discuss axiom systems for measurement and
their interpretation as empirical statements. Special attention
will be paid to some similarities and differences between physical
and psychological variables in terms of methods for their
definition and empirical realization. These similarities and
differences have strong implications for data interpretation and
theory construction in general. Following this we will be in a
position to consider the psychological concept of subjective
probability. The paper will end with some comments concerning the
relation between theoretical and applied research in this area.

Empirical Implications of Axiom Systems for Measurement
Research in the foundations of measurement is concerned with
the conditions required of a set of elements ordered with respect
to a particular qualitative property such that that property may
be represented numerically in a meaningful fashion, i.e., measured.
For example, it may be desired to represent the masses of objects,
individuals' utilities of objects, or individuals' subjective
probabilities of events numerically. The conditions are stated
in the form of axioms about the ordered set, which taken together
are at least sufficient for the existence of an isomorphic (or
homomorphic) mapping from the set of objects into the real numbers.
A proof establishing such existence is called a representation
theorem. A uniqueness theorem establishes the relation that exists
between any two permissible mappings.



Recently Krantz et al. (1971, p. 26ff) have pointed out
that the search for conditions leading to measurement scales is a
search for lawfulness (see also Krantz, 1972a,b; Krantz and Tversky,
1971). In that sense, the axioms in reference to a particular
set of elements and particular operations are empirical statements,
some of which are subject to empirical verification. Thus, the
construction of measurement scales de novo is accomplished only
with the development of an appropriate set of laws, or an
appropriate theory.

To discuss the empirical implications of axiom systems for
measurement, following Krantz's views, consider a set of objects
possessing some qualitative property of interest, for example,
the physical property of mass. The set can be empirically ordered
with respect to that property, in the case of mass with a pan
balance. The empirical ordering in general may be denoted by the
symbol $\precsim$. Thus if rock $a$ in one pan tips the balance
when rock $b$ is in the other pan, we say $b$ is no heavier than
$a$, or $b \precsim a$. And often, but not always, two elements
may be combined, or concatenated, with respect to the property of
interest and compared with a third element. In the present example
this would be done by placing two rocks in one pan of a balance
and a third rock in the other. The concatenation operation in
general may be denoted $\circ$. Thus, if rocks $b$ and $c$
together tip the pan balance over $a$, we write
$a \precsim (b \circ c)$.

It is usually most useful to consider the ordering relation
and the concatenation operation, respectively, to correspond to
the relation "less than or equal to", denoted $\le$, and the
operation "addition", denoted $+$, in the real number system.
Given a set of reasonable axioms, an isomorphic mapping associates
each empirical element $a$ (or class of equivalent empirical
elements) with a number, which we may call $\phi(a)$, such that
for all $a$, $b$ in the empirical set, $a \precsim b$ iff
$\phi(a) \le \phi(b)$. When concatenation is empirically defined
and given an appropriate axiom system, the isomorphism also
assures that $a \precsim (b \circ c)$ iff
$\phi(a) \le \phi(b) + \phi(c)$. When the mapping exists we may
work with the numbers instead of the elements, confident that
within (often unspecifiable) limits of error we are correctly
predicting the relevant aspects of the qualitative property.
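
The two representation conditions just stated can be made concrete with a small sketch. It assumes a hypothetical set of three rocks, a hypothetical record of pan-balance outcomes, and two candidate numerical assignments; none of these names or data come from the studies cited. It only illustrates what it means for a mapping to agree with the ordering and with concatenation.

```python
from itertools import product

def represents_order(phi, elements, observed_leq):
    """True when phi(x) <= phi(y) holds exactly for the observed pairs (x, y)."""
    return all(((x, y) in observed_leq) == (phi[x] <= phi[y])
               for x, y in product(elements, repeat=2))

def represents_concatenation(phi, concat_trials):
    """Check 'a no heavier than b and c together' against phi(a) <= phi(b) + phi(c).

    concat_trials maps (a, b, c) to the observed outcome of that comparison.
    """
    return all(outcome == (phi[a] <= phi[b] + phi[c])
               for (a, b, c), outcome in concat_trials.items())

rocks = ["a", "b", "c"]
# Hypothetical balance outcomes: (x, y) means "x is no heavier than y".
observed_leq = {("a", "a"), ("b", "b"), ("c", "c"),
                ("b", "c"), ("b", "a"), ("c", "a")}
# Hypothetical concatenation trials: is a no heavier than the two others combined?
concat_trials = {("a", "b", "c"): True, ("a", "b", "b"): False}

phi_additive = {"a": 5.0, "b": 2.0, "c": 4.0}
phi_ordinal_only = {"a": 5.0, "b": 3.0, "c": 4.0}  # same order, wrong sums

additive_fits_order = represents_order(phi_additive, rocks, observed_leq)
additive_fits_concat = represents_concatenation(phi_additive, concat_trials)
ordinal_fits_order = represents_order(phi_ordinal_only, rocks, observed_leq)
ordinal_fits_concat = represents_concatenation(phi_ordinal_only, concat_trials)
```

The second assignment preserves the ordering but misses a concatenation trial, which is why an ordinal fit alone does not license addition of the numbers.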

As already mentioned, the qualitative conditions, or axioms,
which must be satisfied by the elements for the mapping to exist,
thereby allowing us the convenience of numbers, are empirical
statements. Some of these are formulated in such a way that they
may be actually subjected to test. For example, one of the
assumptions is that of transitivity: if $a \precsim b$ and
$b \precsim c$, then $a \precsim c$, the empirical test of which
is clear. If it systematically fails, the desired mapping does
not exist, and unless another can be established, real numbers
cannot be used to represent the particular property of the
elements.
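
A test of transitivity on recorded comparisons might be sketched as follows. The comparison records are hypothetical, and the function name is ours. A real test would also need some criterion for "systematic" failure, which this sketch does not attempt.

```python
from itertools import permutations

def transitivity_violations(elements, leq):
    """List triples (a, b, c) where a<=b and b<=c were observed but a<=c was not."""
    return [(a, b, c) for a, b, c in permutations(elements, 3)
            if leq(a, b) and leq(b, c) and not leq(a, c)]

# Hypothetical observed orderings over three elements.
consistent_pairs = {("a", "b"), ("b", "c"), ("a", "c")}
cyclic_pairs = {("a", "b"), ("b", "c"), ("c", "a")}

no_violations = transitivity_violations(
    "abc", lambda x, y: (x, y) in consistent_pairs)
cycle_violations = transitivity_violations(
    "abc", lambda x, y: (x, y) in cyclic_pairs)
```

With the cyclic record every rotation of the cycle is flagged, so no order-preserving assignment of real numbers can exist for it.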

Other axioms, although empirical in principle, are formulated
in such a way that they usually cannot be tested satisfactorily.
An example of this is the Archimedean axiom, which essentially
states that for any pair of elements with $a \prec b$, $a$ can be
concatenated with a sufficient number of identical copies of
itself (say $n$ copies, written for convenience as $na$) so that
$b \precsim na$. Clearly for a particular set of elements this
axiom may not hold for a variety of uninteresting reasons, such
as, for example, there not being sufficient copies of $a$.
Occasionally, however, it may fail on more substantive grounds.
Thus, if we are working with velocities and $b$ is the velocity
of an electromagnetic wave, then regardless of the size of $n$,
it will not be the case that $b \precsim na$. Some of the problems
involved in empirically testing axioms are discussed in Krantz
et al. (1971, p. 28ff), and others in Rapoport and Wallsten (1972).
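
The velocity example can be illustrated numerically. The sketch below assumes that concatenation of collinear velocities follows the standard relativistic composition rule, which the text does not state explicitly, and it works in units where the velocity of light is 1. Exact fractions are used so that no rounding can blur the comparison.

```python
from fractions import Fraction

def concatenate(u, v):
    """Relativistic composition of collinear velocities, in units where c = 1."""
    return (u + v) / (1 + u * v)

def n_copies(a, n):
    """Concatenate n identical copies of velocity a (n >= 1)."""
    total = a
    for _ in range(n - 1):
        total = concatenate(total, a)
    return total

a = Fraction(1, 2)       # half the velocity of light
light = Fraction(1)      # the element b of the example
levels = [n_copies(a, n) for n in (1, 2, 5, 20)]
reaches_light = any(light <= level for level in levels)
```

Each additional copy increases the concatenated velocity, yet it stays strictly below that of light, so the Archimedean condition fails for this pair however many copies are supplied.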

As qualitative laws, the axioms required for the measurement
of intermediate values of mass, length, and time intervals are
so obviously true or uninteresting that there is no reason to
experimentally investigate them (see, e.g., Krantz, 1968). These
properties were successfully measured long before the procedures
were theoretically justified. However, the same set of axioms is
not valid for empirical relational sets with different attributes
of interest. It does not apply, for example, when the property
is utility, intelligence, anxiety, brightness, or almost any other
likely to arise in the social sciences.

There are numerous reasons why this set of axioms does not
generally apply to such properties, but the most important is the
lack of an empirically defined concatenation operation. It is this
lack which led Campbell (1920) and others to claim that fundamental
measurement will never be possible in the social sciences. This
is clearly wrong, as evidenced by developments in the theory of
simultaneous conjoint measurement (Luce and Tukey, 1964; Krantz
et al., 1971), which provides for the simultaneous measurement of
two or more variables given certain conditions. A general
lesson from this work is that qualitative axioms embodying
different empirical properties, but necessary and sufficient to
prove representation and uniqueness theorems, can be developed and
tested. These axioms, which should suggest experiments to be
performed, will constitute a theory concerning how the property or
properties under consideration are ordered and, perhaps, how each
property combines with itself or with other properties. Or to put
it differently, these axioms will constitute a theory concerning
the qualitative behavior of a set of elements subjected to certain
operations, when the members of that set are presumed to differ
among each other in the property or properties of interest.
Indeed, for the purposes of measurement the properties are defined
only in terms of the elements' behaviors in response to certain
empirical operations.
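
As an illustration of how such axioms suggest experiments, the sketch below checks one condition commonly required in additive conjoint measurement, independence of the two factors, on a table of observed ranks. The condition is taken from the conjoint measurement literature rather than from the text above, and the tables are hypothetical data (rows are levels of one factor, columns levels of the other).

```python
from itertools import combinations

def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def is_independent(table):
    """True when the order of rows is the same in every column, and vice versa."""
    n_rows, n_cols = len(table), len(table[0])
    rows_ok = all(
        len({sign(table[i][k] - table[j][k]) for k in range(n_cols)}) == 1
        for i, j in combinations(range(n_rows), 2))
    cols_ok = all(
        len({sign(table[i][k] - table[i][m]) for i in range(n_rows)}) == 1
        for k, m in combinations(range(n_cols), 2))
    return rows_ok and cols_ok

# Hypothetical rank orders of factor combinations (larger = ordered higher).
consistent_ranks = [[1, 3], [2, 4], [5, 6]]
crossover_ranks = [[1, 4], [2, 3]]   # the row order reverses between columns

consistent_ok = is_independent(consistent_ranks)
crossover_ok = is_independent(crossover_ranks)
```

Independence is only a necessary condition. Passing it does not by itself establish an additive representation, which requires further conditions.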

In that sense, going back to the previous example, for purposes
of measurement mass is defined only in terms of the behavior of
rocks in pan balances. The fact that the variable so defined can
be related to behaviors of many other objects as well and that
units of mass can be algebraically combined with units of other
variables in meaningful fashions attests to its vast generality
and usefulness.

Similarly in the social sciences, especially psychology,
variables may be defined in terms of qualitative laws, or axioms,
concerning the behavior of elements presumed to differ in the
particular variable or variables. The elements here, however, are
organisms under different circumstances, or one organism under
various circumstances, and, potentially, the variables are any
from achievement to zoophilism. Perhaps the earliest work in this
vein is the axiomatization of utility theory by von Neumann and
Morgenstern (1944).

Luce (1972) has argued that the degree of stability and
generality obtained with measurement of physical variables,
particularly the ability to combine measures of different
variables in meaningful algebraic structures, has not been shown
yet in psychophysics, and doubts, therefore, that measurement as
it exists in physics will ever exist in psychophysics. He has
explicitly not extended his argument to other areas of psychology,
not because of evidence to the contrary, but because of a dearth
of evidence.

His argument is probably valid for other areas of psychology,
such as learning, motivation, or decision theory, which consider
intervening variables (see footnote 4). What are the reasons?
Certainly not that the types of psychological variables under
consideration need be any less well defined than the physical
variables. This can be done analogously in both cases, with
specification of ordinal empirical laws.

One possible, well-known reason is that psychology has yet to
determine a small set of variables that in some sense is basic to
understanding all aspects of behavior, and whose interrelationships
may be specified. Perhaps such a set does not exist.

There is another reason, which has implications for the use
that can be made of resulting measurement scales, and the kinds of
interpretations that can be attached to them. Although the
definitions of physical and psychological intervening variables,
or properties, may be equally fundamental, their empirical
realizations are not equally simple. Physical variables can be
made manifest and studied with more-or-less simple apparatus under
well defined conditions. Thus, elements and combinations of
elements are ordered with respect to mass by means of a pan
balance, or other instrument calibrated to reflect the information
which would be given by a pan balance. And importantly, although
obvious once mentioned, the variables being investigated are
independent of the apparatus used to investigate them. A set
of rocks could be weighed on any suitable pan balance or
corresponding instrument. Ignoring relativistic considerations, it
is generally assumed that manipulations on the variables leave
unaltered the equipment through which the effects are observed.
For example, one may observe the pressures of various gases at
different temperatures, assuming that the temperature changes
affect the gases and not the indicating instrument. It is assumed
that readings on the instrument will reflect only pressure, and
not other factors, regardless of the temperature.

The situation is different in psychology. Luce (1972)
has suggested that psychophysics might profitably be considered
the study of a very complicated measuring device that transduces
various inputs into common neural units, rather than considering
it measurement comparable to physical measurement. For
intervening variables studied in other branches of psychology
matters may be more complex yet. Here the variables originate
within the organism in response to events internal and external
to it. That is, to repeat from above, the organisms are the
elements, and the properties that they embody and we desire to
measure vary depending on circumstances internal and external
to the elements.

But more than that, the organism is also the instrument that
conveys information to us regarding the qualitative ordering of
the property or properties of interest. Thus, the human in whose
uncertainties we are interested tells us by his behavior whether
his uncertainty under condition A is greater than or less
than that under B. If habit strength and incentive are the
variables being conjointly measured, the dog's behavior tells us
whether one combination is ordered above or below another. Unlike
in physics, we cannot separate the elements in whose properties we
are interested from the device which makes those properties
manifest. They are one and the same, namely the living organism.
Manipulations intended to affect the variables of necessity also
affect the device used to order the variables. Clearly, if the
latter is not invariant, simple algebraic relations between the
former will not emerge. Indeed, theories about the variables will
often depend on theories about the device.

This, it is claimed, is a fundamental difference between
the empirical realization and measurement of variables in
psychology and physics. And it is for this reason that resulting
measurement scales will not have the same generality in psychology
as in physics.

Subjective Probability
Two arguments were developed in the previous section. The
first was that for purposes of original measurement psychological
variables should be defined in terms of qualitative empirical
statements which are at least sufficient to prove a representation
theorem. The second point was that since any behavior which makes
the variable manifest also reflects other variables, changes in
behavior over different situations often cannot be understood
without embedding that theory in a more general one encompassing
the other variables. It is with these considerations before us
that we look at the relation between the psychological and
mathematical/philosophical concepts of subjective probability.

The mathematical work on subjective probability has centered
on formulating axioms concerning an ordered set (technically,
concerning an algebra of sets, to be defined below), which
together are sufficient to prove the existence of a function
$P$ from the set into the real interval $[0,1]$ such that the
three properties of a probability measure hold, viz: $P(A) \ge 0$;
$P(X) = 1$; and if $A \cap B = \emptyset$, then
$P(A \cup B) = P(A) + P(B)$; where $X$ is the sample space, or
sure event; $A, B \subseteq X$; and $\emptyset$ is the null event.
As Fishburn (1967) has pointed out, the axiomatizations have been
of two forms. In one the elements are ordered by a binary relation
denoted here $\precsim$, and in which $A \precsim B$ is read "A is
not more likely than B." Axiomatizations of this sort are found in
de Finetti (1937), DeGroot (1970), Koopman (1940), Villegas (1964,
1967), Luce (1967), and others. They are collectively discussed in
Chapter 5 of Krantz et al. (1971). DeGroot's (1970) formulation,
based on that by Villegas (1964), is especially clear.

In axiom systems of the second form, often called subjectively
expected utility (SEU) theory, the elements are ordered by a
binary relation denoted here $\preceq$ and in which $a \preceq b$
is read "a is not preferred to b." Here a and b are conceived of
as sure commodities or as probability mixtures of outcomes, i.e.,
commodities conditional on uncertain events. A representation
(utility function) is established from the certain outcomes or
commodities into the real numbers, and a probability function is
simultaneously established from the uncertain events into the
interval $[0,1]$. Axiom systems of this form are in Savage (1954),
Fishburn (1970), and Luce and Krantz (1971).

The philosophical impact of this work has been to establish
a foundation for probability theory as the "opinion of rational
man," or as "rational opinion." That is, either binary relation,
$\precsim$ (relative likelihood) or $\preceq$ (preference),
depends on judgments of the sort humans often make, and
the axiom systems contain "rational" statements about relative
likelihood or preference, respectively. If one does not disagree
with the axioms, one cannot disagree with their implications and,
therefore, probability theory describes rational opinion. (The
fact that mortals may exist who accept the axioms but not their
conclusions is immaterial here.)

Furthermore, de Finetti (1937) proved that a coherent set of
preferences among probability mixtures of outcomes is a necessary
and sufficient condition for the derivation of a mapping which
satisfies the requirements of a probability measure from the
uncertain events into the interval $[0,1]$. One's set of
preferences is coherent if they do not allow his gambling opponent
to select options which leave him simultaneously happy (i.e., do
not violate any of his preferences) and guaranteed to lose.
Clearly, this is a weak requirement for the existence of a
probability measure!
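
The notion of coherence can be illustrated in its simplest case, a single event and its complement. The sketch assumes the individual states, for each of the two events, the price at which he is willing either to buy or to sell a ticket paying one unit if that event occurs. This is a deliberately reduced setting and not de Finetti's general construction; the function and parameter names are ours.

```python
def sure_loss_book(q_event, q_complement, stake=1.0, tol=1e-12):
    """Return the individual's net payoff in each state under an exploiting
    set of bets, or None when the two stated prices are coherent.

    q_event, q_complement: prices (per unit stake) the individual regards as
    fair for a ticket paying `stake` if the event, or its complement, occurs.
    """
    total = q_event + q_complement
    if abs(total - 1.0) <= tol:
        return None                      # prices sum to one: no sure loss here
    individual_buys = total > 1.0        # opponent sells both tickets, else buys both
    payoffs = {}
    for event_occurs in (True, False):
        ticket_payout = stake            # exactly one of the two tickets pays off
        cost = total * stake
        net = ticket_payout - cost if individual_buys else cost - ticket_payout
        payoffs[event_occurs] = net
    return payoffs

coherent_case = sure_loss_book(0.5, 0.5)
overpriced_case = sure_loss_book(0.6, 0.6)    # prices sum to more than one
underpriced_case = sure_loss_book(0.4, 0.4)   # prices sum to less than one
```

In both incoherent cases the net payoff is negative whether or not the event occurs, although every individual bet was acceptable at the stated prices.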

Research in psychology has responded to this work in two ways.
Various studies have attempted to assess the descriptive validity
of one or more of the axioms, especially those concerned with the
preference relation. Others have, at least implicitly, found the
axioms so compelling that their concern has been with the
measurement of subjective probability distributions, assuming
their existence. Much of the latter work is reviewed in Stael von
Holstein (1970) and Murphy and Winkler (1970).

If one accepts the arguments in the first part of this paper,
then the psychological variable of subjective probability is
defined for purposes of measurement by an appropriate set of
qualitative empirical laws which taken together represent a theory
about that variable. The axiom systems for relative likelihood
and for preference are obvious candidates for such laws, and
knowledge of their descriptive validity becomes important. Unless
either of the systems is empirically valid or another can be found
which is, measurement of subjective probability, in the present
sense of that term, is impossible, and theories of that form are
not useful for describing decision behavior under uncertainty.

However, as discussed earlier, even when a system is valid,
interpretation of the derived scale values, or of the measurements,
is still problematical. It will often depend on the more general
theory in which the specific set of empirical laws is embedded.
This presents interesting challenges for the theoretician and
serious problems for the practitioner. We will consider the latter
following the theoretical discussion.

Empirical validity of preference based axioms. Research
since 1965 relevant to the descriptive validity of SEU theory has
been reviewed by Rapoport and Wallsten (1972) and need not be
discussed here. It will suffice to reproduce their conclusion:

"...It seems then that the conflicting evidence
pertaining to SEU theory is presently irreconcilable.
Consequently, the basic experimental question should
not be whether to accept or reject SEU theory as a
whole, but rather to systematically discover the
conditions under which it is or is not valid (Rapoport
and Wallsten, 1972, p. 141)."

Clearly, when SEU theory does not describe subjects' choices
under uncertainty, this does not mean that they do not experience
uncertainty, nor even that they could not rank order the uncertainty
associated with the various events. But it does mean, since as
pointed out before we are simultaneously studying elements differing
in certain properties (here the subject faced with various gambles)
and the device (also the subject) that makes the effects of
these properties manifest, that either our empirical laws
concerning the former are fundamentally wrong or our theory
concerning the latter is wrong or incomplete. In either case the
operational definition of subjective probability is inappropriate
and fundamental measurement is rendered impossible.

Since there are situations in which SEU theory is valid
(e.g., Tversky, 1967; Wallsten, 1971), it would seem that the
qualitative empirical laws concerning how subjective probability
and utility conjoin are not fundamentally wrong, but rather that
they must be embedded in a more general theory which encompasses
other variables as well and allows a priori prediction of when
these other variables will affect the observed choice behavior.
We are not prepared currently to offer such a theory, and can
only suggest that it would represent a major step towards relating
subjective probability to choice and understanding individual
decision behavior.

Empirical validity of likelihood based axioms. To the best
of our knowledge there have been no extensive empirical tests
of the descriptive validity of these axiom systems, and with good
reason, since they are probably virtually untestable for any
interesting sample space. In view of our argument that subjective
probability, like other psychological variables, ought to be
axiomatically defined, and the fact that elegant, compelling
axiomatizations for a probability representation exist, this
statement deserves amplification.

Although most axiom systems require infinite sample spaces,
some exist for finite spaces (e.g., Kraft, Pratt, and Seidenberg,
1959; Fishburn, 1969). We will discuss the former.

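Consider a nonempty sample space, or set, $X$, and a nonempty
family of subsets of $X$, $\mathcal{E}$, in which for every
$A \in \mathcal{E}$, $\bar{A} \in \mathcal{E}$ ($\bar{A}$ is the
complement of $A$), and for every $A, B \in \mathcal{E}$,
$A \cup B \in \mathcal{E}$. $\mathcal{E}$ is called an algebra of
sets. In addition, if for all $A_i \in \mathcal{E}$,
$i = 1, 2, \ldots$, it is the case that
$\bigcup_{i=1}^{\infty} A_i \in \mathcal{E}$, then $\mathcal{E}$
is called a $\sigma$-algebra or $\sigma$-field.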

The ordering relation $\precsim$, "is not more likely than," is
defined over $\mathcal{E}$. Considering axiomatizations primarily
for infinite $X$, if the triple
$\langle X, \mathcal{E}, \precsim \rangle$ satisfies five axioms,
a probability function, $P$, exists from $\mathcal{E}$ into the
real interval $[0,1]$. Various statements of the five axioms
exist, some of which were referenced above. They differ from each
other primarily in terms of the fifth axiom. The set given here is
due to Luce (1967) and requires $\mathcal{E}$ to be an algebra,
not necessarily a $\sigma$-algebra. The five axioms state that
for all $A, B, C, D, A_1, \ldots, A_i, \ldots$:

(1) $\precsim$ is a weak order.

(2) $\emptyset \prec X$ and $\emptyset \precsim A$.

(3) If $A \cap B = A \cap C = \emptyset$, then $B \precsim C$ iff $A \cup B \precsim A \cup C$.

(4) Archimedean: If $A_i \cap A_j = \emptyset$ for all $i \ne j$, $\emptyset \prec A$, and $A_i \sim A$ for all $i$, then the set of positive integers $N = \{\, n \mid \bigcup_{i=1}^{n} A_i \ \text{[illegible]} \,\}$ is finite.
(5) If $A \cap B = \emptyset$, $C \prec A$, and $D \precsim B$, then there exist $C', D', E \in \mathcal{E}$
such that $E \sim A \cup B$, $C' \sim C$, $D' \sim D$, $C' \cup D' \subseteq E$, and $C' \cap D' = \emptyset$.
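
For a small finite sample space the first three axioms can be checked exhaustively, as the following sketch suggests. It takes the algebra to be the full power set of a three-element set, and the two example orderings are hypothetical: one induced by integer weights, one by the largest element an event contains. The function names are ours, and Axioms 4 and 5 are not examined.

```python
from itertools import combinations, product

def power_set(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1)
            for c in combinations(xs, r)]

def check_axioms_1_to_3(X, leq):
    """Exhaustively check Axioms 1-3 for 'leq' ("is not more likely than")."""
    E = power_set(X)
    empty, full = frozenset(), frozenset(X)
    connected = all(leq(A, B) or leq(B, A) for A, B in product(E, repeat=2))
    transitive = all(leq(A, C) for A, B, C in product(E, repeat=3)
                     if leq(A, B) and leq(B, C))
    axiom2 = (leq(empty, full) and not leq(full, empty)
              and all(leq(empty, A) for A in E))
    axiom3 = all(leq(B, C) == leq(A | B, A | C)
                 for A, B, C in product(E, repeat=3)
                 if not (A & B) and not (A & C))
    return {"axiom 1": connected and transitive,
            "axiom 2": axiom2,
            "axiom 3": axiom3}

weights = {1: 5, 2: 3, 3: 2}   # hypothetical integer weights on the outcomes

def by_weight(A, B):
    return sum(weights[x] for x in A) <= sum(weights[x] for x in B)

def by_largest(A, B):
    return max(A, default=0) <= max(B, default=0)

weight_result = check_axioms_1_to_3({1, 2, 3}, by_weight)
largest_result = check_axioms_1_to_3({1, 2, 3}, by_largest)
```

The weight-induced ordering passes all three checks, as any ordering derived from a probability measure must. The second ordering is a weak order and satisfies Axiom 2 but violates Axiom 3, since adjoining a common disjoint event can erase a strict difference.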

The first three axioms are clear, but the fourth and fifth
require comment. The Archimedean axiom states that for an event
$A$ strictly less likely than $B$, but with $P(A) > 0$, only a
finite number of disjoint events equally likely as $A$ may be
joined by the union operation in a subset which is also strictly
less likely than $B$. Or in other words, if $P(A) > 0$, then any
subset containing $n+1$ "identical copies of $A$" has probability
greater than a subset containing $n$ copies. Luce's actual
formulation of this axiom avoids the problems involved in there
being an insufficient number of $A_i$ and assures that the
sequence of $\bigcup_{i=1}^{n} A_i$ formed as $n = 1, 2, \ldots$
is bounded by $X$.

Axiom 5 is best paraphrased by Luce himself: "... if A and
B are disjoint and dominate C and D respectively, then there are
disjoint subsets of $A \cup B$ that are equivalent in probability
to C and D (Luce, 1967, p. 781)."

Axioms 1-4 are necessary for a probability representation, but
not sufficient. That is, given the existence of a representation,
1 through 4 will not be violated. But the converse is not
necessarily true. An example of an ordering satisfying Axioms 1-3
(4 does not apply), but not admitting of a probability
representation, has been provided by Kraft, Pratt, and Seidenberg
(1959). Thus, a fifth axiom is needed to limit the structures to
which 1 through 4 will be applied. The one above from Luce (1967)
is weaker than most in that it applies to some finite X as well as
infinite X. Most others apply only to infinite X.



For example, the first four axioms by DeGroot (1970) are
equivalent to those proposed by Luce (1967). But his system
requires $\mathcal{E}$ to be a $\sigma$-algebra, and the fifth
axiom is that there exists a random variable which has a uniform
distribution on the interval $[0,1]$. Then the original space is
enlarged by composing it with that of the new random variable,
and the continuous, uniformly distributed random variable is used
to establish the probability mapping from the original
$\mathcal{E}$ into the $[0,1]$ interval.

It should be obvious why the systems shown here, which are
typical, will not easily lend themselves to empirical study. The
first three axioms will rarely fail, although with some ingenuity
one could probably arrange failures of the first. Furthermore, if
we were to test them on sets to which the system is restricted by
axiom 5, we would in general be required to use infinite sets, and
a complete test would be impossible. This latter point is relatively
minor, since systematic failure of axioms 1, 2, or 3 would suffice
to reject the probability representation for any set. However, if
it is agreed beforehand that failure in any circumstance is
unlikely, then success is not very interesting. An empirical test
of axiom 4 would also be very unlikely to fail, but, as with any
Archimedean axiom, any extensive test would require huge numbers
of observations, and is quite infeasible.

Finally, although it may be easy to create circumstances in
which Luce's axiom 5 is rejected, that does not imply the
nonexistence of a probability representation. And in general it
would be impossible to establish its success, since an infinite
number of judgments would be required.



At first blush, DeGroot's axiom 5 would appear to be empirically
sound; we might, for example, introduce a perfectly balanced
spinner which randomly picks points around the unit circle. But in
light of the difficulties Davidson, Suppes, and Siegel (1957) had
in establishing a binary event with subjective probability
one-half, it is doubtful that a continuous variable with a uniform
subjective distribution could be found.

If one can not test the axioms, one might look for other
necessary conditions with more empirical content. Thus, Ellsberg
(1961) has demonstrated, and Becker and Brownson (1964) have firmly
substantiated, that ambiguity affects whether human subjective
probability can be represented by a probability function. Specifically,
using only binary choices in an informal experiment, Ellsberg
(1961) demonstrated that for many subjects $R_u \sim B_u$ and $R_a \sim B_a$,
approximately, but $R_a \prec R_u$ and $B_a \prec B_u$, where $R_u$ is the event of
drawing a red ball on a single trial from an urn unambiguously
containing 50 red and 50 black balls, and $B_u$ is the event of
obtaining a black ball from that urn on one draw. $R_a$ is the event
of drawing a red ball in a single trial from an ambiguously
constituted urn in which it is only known that there are 100 red
and black balls. $B_a$ is the corresponding event for the black
ball. Clearly that set of binary judgments can not be represented
by a probability measure.
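
The contradiction can be spelled out in one line. The sketch below writes $R_u$, $B_u$ for red and black from the unambiguous urn and $R_a$, $B_a$ for red and black from the ambiguous urn (labels assumed here for readability), reads $\prec$ as "judged less likely than," and supposes that some probability measure $P$ agrees with the judgments. Only the two strict judgments are needed, together with the fact that red and black are complementary within each urn:

```latex
\begin{align*}
R_a \prec R_u \;&\Rightarrow\; P(R_a) < P(R_u), \\
B_a \prec B_u \;&\Rightarrow\; P(B_a) < P(B_u), \\
\text{adding:}\quad 1 = P(R_a) + P(B_a) \;&<\; P(R_u) + P(B_u) = 1 ,
\end{align*}
```

which is impossible, so no such $P$ exists. The approximate indifference within each urn is not required for the argument.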

Undoubtedly, with sufficient skill and insight there can be
discovered other necessary conditions for the representation and
other variables which cause them to be violated. However, the
first question is whether under appropriate conditions one can find
an interesting sample space which is not so large as to preclude
an experiment, and for which linear inequalities based on binary
judgments and the assumption of a probability measure are solvable.
As far as we know such an experiment has not been done. If the
results of such an experiment were to be positive, then we would know
that at least under some circumstances the psychological variable of
uncertainty would result in behavior with ordinal properties
which are consistent with a probability measure. The fact that
SEU theory has been occasionally validated provides only the
mildest support for such a statement about human uncertainty,
because the chance events in these experiments have rarely been
more than binary.
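
As a rough illustration of what "solvable linear inequalities" means for a small sample space, the following Python sketch searches a rational grid for atom probabilities that agree with a set of binary likelihood judgments. The function name, the grid resolution, and the example judgments are all assumptions made for illustration; they are not taken from any study cited here. A grid search can confirm that a representation exists, but a failure only holds at the chosen resolution, and an exact test would solve the inequalities directly.

```python
from fractions import Fraction
from itertools import product


def find_probability_measure(n_atoms, judgments, resolution=20):
    """Search a rational grid for atom probabilities consistent with judgments.

    judgments is a list of (A, B, strict) triples, where A and B are sets of
    atom indices. strict=True means "A is judged more likely than B";
    strict=False means "A is judged at least as likely as B".
    Returns a tuple of Fractions summing to 1, or None if no grid point fits.
    """
    for counts in product(range(resolution + 1), repeat=n_atoms - 1):
        last = resolution - sum(counts)
        if last < 0:
            continue  # the first n-1 atoms already exceed total mass 1
        p = [Fraction(c, resolution) for c in counts + (last,)]
        consistent = True
        for a, b, strict in judgments:
            pa = sum(p[i] for i in a)
            pb = sum(p[i] for i in b)
            if (strict and not pa > pb) or (not strict and not pa >= pb):
                consistent = False
                break
        if consistent:
            return tuple(p)
    return None


# Hypothetical judgments over three atoms 0, 1, 2.
representable = [({0}, {1}, True), ({1}, {2}, True), ({1, 2}, {0}, True)]
# A cycle of strict judgments, which no probability measure can satisfy.
cyclic = [({0}, {1}, True), ({1}, {2}, True), ({2}, {0}, True)]

measure = find_probability_measure(3, representable)
no_measure = find_probability_measure(3, cyclic)
```

With more than a handful of atoms the grid grows quickly, which is one concrete form of the requirement that the sample space not be so large as to preclude an experiment.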

The question whether there are situations in which human
behavior is consistent with probability theory is important, but
only of limited usefulness to psychology. It is important because
we would clearly like to know the conditions under which human
and "rational" opinion agree. Furthermore, a considerable amount
of research is currently concerned with developing methods, such
as proper scoring rules (Stael von Holstein, 1970), for measuring
an individual's "true" subjective probability, and it would be
well to know when that concept is well defined.

But the question is only of limited interest to psychology,
because by itself it has the potential of explaining only a very
narrow segment of decision behavior. The more fruitful psychological
questions concern how uncertainty arises and is affected by other
factors and how it combines with other psychological variables to
determine choices. It is within theories designed to answer these
questions that one would like to define and perhaps measure
uncertainty. That is one reason why SEU theory has been extensively
studied.

Other Systems. Note that when the matter is put this way it
is still required that the psychological variable of uncertainty
be defined in terms of qualitative laws relating it to behavior.
Assuming that the laws are such as to allow fundamental measurement,
and there is no compelling reason why they should be, there is no
requirement that the measurement conform to the rules of probability
theory. The properties of the scales will depend on representation
and uniqueness theorems, and their interpretation will depend on
the nature of the general theory. Unfortunately there are very
few theories concerning behavioral aspects of uncertainty which
meet the criteria discussed here.

As an example of one that does at least in part, using the
theory of simultaneous conjoint measurement (Krantz et al., 1971;
Krantz and Tversky, 1971), Wallsten (1972) presented a very general
additive (and under some conditions distributive) model for
revision of opinion in the presence of probabilistic information.
This is the usual Bayesian probability revision task. Although
the required experiments are complicated, the model is easy to
state and will be given here in the distributive form, which
applies when choosing between two alternative hypotheses, $X_i$
and $Y_j$, on the basis of a sample of n identical events, E. These
restrictions have both advantages and disadvantages which need
not be discussed here. This form of the model states that

$$R(X_i, Y_j \mid nE) = f\{\phi_3(n)\,[\phi_1(E \mid X_i) - \phi_2(E \mid Y_j)]\}, \qquad [1]$$

where the left side of the equation refers to a response with
at least ordinal properties concerning the likelihood of $X_i$
relative to $Y_j$ on the basis of n events. $\phi_3$ is a real valued
function whose domain is sample size and which refers to the
subjective diagnostic value of that number of replicates of E.
$\phi_1$ is a real valued function representing the subjective conditional
likelihood of E given $X_i$. Given a particular E, the domain of
$\phi_1$ is traced out by varying $X_i$. $\phi_2$ is the corresponding subjective
function for E given $Y_j$.
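
To make the structure of the distributive form concrete, here is a minimal Python sketch. Every number in it is hypothetical and chosen only for illustration; none is an estimate from Wallsten (1972). The sketch also assumes that f is strictly increasing (the identity by default), which is the reading under which the response carries ordinal information about the product inside the braces.

```python
import math


def distributive_response(phi1_x, phi2_y, phi3_n, f=lambda z: z):
    """Distributive form: f{ phi3(n) * [phi1(E|X_i) - phi2(E|Y_j)] }.

    f is ASSUMED strictly increasing here; the identity is the default.
    """
    return f(phi3_n * (phi1_x - phi2_y))


# Hypothetical scale values, for illustration only.
phi3 = {1: 1.0, 2: 1.6}        # subjective diagnostic value of n replicates of E
phi1 = {"X1": 0.7, "X2": 0.9}  # subjective likelihood of E given X_i
phi2 = {"Y1": 0.4}             # subjective likelihood of E given Y_j


def ranked_conditions(f=lambda z: z):
    """Order the (hypothesis, sample size) conditions by predicted response."""
    responses = {
        (x, n): distributive_response(phi1[x], phi2["Y1"], phi3[n], f)
        for x in phi1
        for n in phi3
    }
    return sorted(responses, key=responses.get)


ranking = ranked_conditions()
# The predicted ordering is unchanged by another strictly increasing f.
ranking_exp = ranked_conditions(f=math.exp)
```

Because only the ordering of responses is used, the same ranking results under any strictly increasing f, which is why ordinal data suffice to test the model. In this made-up example the value for two replicates (1.6) is less than twice the value for one (1.0), the kind of pattern the scale for sample size is meant to capture.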

Relying only on the ordinal properties of the data one may
check the empirical validity of certain of the axioms necessary
for the model. Assuming neither gross nor systematic failures,
numerical representations for the scale values can be derived.
However, the scales themselves are of very little import. Of
greater interest is isolation of those variables which cause one
or more of the axioms to fail and interpretation of relations
between the scale values when the model does not fail. It is
worth describing some of the experimental results to clarify
these concepts.

Thus, Wallsten (1972) found the model to be reasonably accurate
for eight of his 12 subjects and to fail in well defined ways for
two others. When the model was valid the derived values for $\phi_3$
decisively showed that information samples of two identical events
carried considerably less than twice the diagnostic weight of one of
those events alone. $\phi_1$ and $\phi_2$ could be plotted against objective
measures for the likelihood of E given $X_i$ and $Y_j$, respectively.
The ratio of the slope of the least squares best fitting line for
$\phi_1$ to that for $\phi_2$ was invariant under all permissible transformations
of the two functions, and was greater than 1.0 for all subjects.
This was interpreted either in terms of attentional factors or
an aversion towards uncertainty, the latter interpretation being
at variance with the former results mentioned.

Wallsten,^9 in a much more extended experiment scoring responses
with the spherical scoring rule, but still relying only on ordinal
properties of the data, found the model to hold for roughly the
same percentage of subjects. The results concerning values of $\phi_3$
were replicated and extended to samples of size three. Looking
at the effects of payoffs determined by the scoring rule on the
relation between $\phi_1$ and $\phi_2$, the aversion to uncertainty
interpretation was rendered much less likely, since the ratio of
slopes was less than 1.0 for some subjects. If the attention
interpretation is acceptable, then the effects of payoffs were to
increase between-subject variability in that factor, since there
was considerably greater between-subject variability in the slope
ratios than there was in the previous study.

Finally, Wallsten and Delaney,^10 using a Marschak bidding
procedure, but still only the ordinal data properties, again found
the model reasonably accurate for about two-thirds of the subjects.
The results for samples of two or three identical events were
again replicated. But now, upon analyzing samples of size three
with two identical events and one different, the effect virtually
disappeared. Here the two identical events were just about twice
as diagnostic as that event appearing singly. Clearly the
composition of the information sample affects the subjective values
of its components.

In this experiment $\phi_1$ could be plotted directly as a function
of $\phi_2$, without regard to other values. Biased payoff matrices
resulted in differential utilities for the alternatives between
which the subjects were deciding on the basis of the information.
This differential utility slightly affected the derived values for
$\phi_1$ and $\phi_2$ in a way consistent with the attention interpretation
and with the subjects' probabilities of choice between the
alternatives. The choice probabilities themselves were very strongly
affected.

Certainly this work is of limited scope. For example, it
says nothing about how the revised opinions combine with other
factors to determine final decisions. And so far the work has
been confined to simple alternative hypotheses and simple
information samples. However, it does illustrate remarks in the first
half of this paper and shows some of the difficulties in applying
them to real data. Thus, an important unanswered question
concerns those factors responsible for the model's success with
some subjects and failure with others. Also, the concept of
attention seemed to be useful above, but it still remains to make
that concept more precise and bring it under better experimental
control. This series of experiments also demonstrates that with
appropriate designs ordinal data can be very rich.

An example of another investigation of the determinants of
subjective probability relying primarily on ordinal data is that
by Kahneman and Tversky (1972) and Tversky and Kahneman (1972).
They showed that subjects' judgments of probability were strongly
influenced by the degree of similarity between a sample and the
population from which it was drawn. In other circumstances the
judgments were determined in part by the ability to recall past
instances of the event in question. These demonstrations are
potentially very important and it is to be hoped they will be
replicated in rigorous experiments. Note that, as the authors
themselves point out, they have not proposed a theory, but have
provided qualitative data for which, if replicated, any theory
will have to account.

Applied Decision Theory 



Conspicuous by its absence thus far has been any mention of
data involving subjects' numerical estimates of probabilities.
This is consistent with the philosophy of this paper, that the
psychological variable of uncertainty is qualitative, like any
other variable, and only when certain conditions are met can it
be represented by numbers. And then the numbers must reflect the
behavior induced by that variable, not be generated more or less
independently of that behavior. However, probability (and utility)
numbers are needed in the application of normative decision theory
to real problems, and if the applications are to make any sense at all
it is necessary that they reflect something of the opinions and
values of the people involved.

It is primarily for this reason that many ingenious methods
have been devised to assist subjects in generating probability
numbers which reflect their "true" opinions. Chief among these
methods is the family of strictly proper scoring rules, which
requires subjects to give estimates identical to their subjective
probabilities in order to maximize their subjectively expected
gain. One of these rules is used routinely to score weather
forecasters (Murphy and Epstein, 1967). There are also various
other methods involving fractionation of sample spaces and
hypothetical experiments (see Pratt et al., 1965; Stael von Holstein,
1970, Chapter 5; Winkler, 1967). But since it has not yet
been established that human opinion can be properly represented
by a probability function, and we concluded that at best rarely will
that be the case, there would appear to be a basic conflict. We
can suggest two possible solutions.
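
The defining property of a strictly proper scoring rule can be checked numerically. The Python sketch below uses the standard binary form of the spherical rule and assumes an assessor who holds a probability p for the event and chooses the stated value r that maximizes expected score. The function names and the grid search are illustrative choices, not a procedure from the studies cited.

```python
import math


def spherical_score(r, occurred):
    """Spherical score for a binary event; r is the stated probability."""
    norm = math.sqrt(r * r + (1.0 - r) * (1.0 - r))
    return (r if occurred else 1.0 - r) / norm


def expected_score(r, p):
    """Expected score of stating r for an assessor whose belief is p."""
    return p * spherical_score(r, True) + (1.0 - p) * spherical_score(r, False)


def best_report(p, steps=1000):
    """Stated probability on a grid of steps+1 points with highest expected score."""
    grid = [k / steps for k in range(steps + 1)]
    return max(grid, key=lambda r: expected_score(r, p))


honest = expected_score(0.7, 0.7)       # stating one's belief
overstated = expected_score(0.9, 0.7)   # exaggerating
hedged = expected_score(0.5, 0.7)       # retreating to one-half
```

For a belief of 0.7, the honest statement yields a higher expected score than either the overstated or the hedged one, and `best_report(0.7)` lands on the grid point at 0.7. The check presupposes that the assessor has a probability p in the first place, which is exactly the assumption questioned in this paper.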

One of them is easier to state than to execute. It is
desirable to have practice conform as closely to theory as possible.
Thus initially one might attempt to calibrate responses obtained
using a particular method with scale values derived from a theory
valid in that situation. After the calibration has been
painstakingly established the method could be used routinely. This
is what is done when a spring scale is used instead of a pan
balance to weigh objects. Of course, if calibrated scales can
be established, they will not in general be probabilities, although
this should only occasionally be a problem. However, considering
the numerous variables which influence either verbal responses
or the validity of a theory, such calibration procedures are
quite unlikely to be successful.

One was tried recently by Wallsten (see footnote 9) to
evaluate a strictly proper scoring rule. In that study subjects'
probability estimates in the revision of opinion task were scored
with a spherical scoring rule. It will be recalled that the
distributive model for opinion revision provided a reasonably
accurate description of most subjects' behavior, and scale values
could be derived from that model. Considering only those subjects
for whom it was concluded that the distributive model was valid,
if the maximization principle upon which the scoring rule rests
also described their behavior, there should exist an identity
transformation relating their probability estimates to the scale
values combined according to the distributive model. Of those
subjects whose behavior was judged to be described by the
distributive model we selected the one whose responses were best
described by the scoring rule and the one whose responses were most
poorly described by the scoring rule, and present in Figure I
the monotonic transformation of their estimates that best fits
the model values. In this framework, subject 3PP was reporting
his "true" opinion and 2Prj was not. Clearly, it is not an
easy matter to evaluate when the scoring rule is forcing
subjects to be "honest," much less to establish a general calibration
function.

The other solution is to treat applied decision theory in a
manner similar to classical test theory. That, it appears to us,
is possible and appropriate. That is, the question whether a
set of responses reflects "true" opinion or not is irrelevant. The
aim is to obtain reliable responses which are valid, i.e., correlate
highly with other external criteria. The criteria might be other
measures of uncertainty or they might be ultimately satisfactory
decisions. Hoffman and Peterson (1972) put the matter well when
they described the use of a proper scoring rule to assist the
assessor in learning "what kinds of numbers are warranted by
different states of knowledge" (Hoffman and Peterson, 1972, p. 2).

Figure I: Data from two subjects showing for each the monotonic
transformation of the original probability estimates
(response) that best fits the distributive model of
opinion revision (Adapted from Wallsten, see footnote 9)

In essence, however, this is the approach already being
followed by some researchers. Thus Murphy and Winkler wrote
that "...perhaps the most important attribute of the (verbally
estimated) probabilities is their 'validity', i.e., the association
between the probability statements and the actual outcomes"
(Murphy and Winkler, 1970, p. 28). Winkler and Murphy (1968)
and Murphy and Winkler (1971) discuss further the association
between an estimate and the occurrence of an event on a single
occasion and the correspondence between a collection of similar
estimates and the appropriate relative frequencies. Alpert and
Raiffa (1969) reported an experiment evaluating properties of
verbally assessed probability distributions which were necessary for
them to represent the actual distributions of various uncertain
quantities. A brief general discussion of external validity appears
in Chapter 4 of Stael von Holstein (1970).
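
The correspondence between a collection of similar estimates and the appropriate relative frequencies can be tabulated directly. The following Python sketch groups events by the probability stated for them and reports the observed relative frequency in each group. The function name and the data are hypothetical, made up only to show the bookkeeping.

```python
from collections import defaultdict


def calibration_table(estimates, outcomes):
    """Map each stated probability to (number of events, observed frequency)."""
    groups = defaultdict(list)
    for stated, occurred in zip(estimates, outcomes):
        groups[stated].append(1 if occurred else 0)
    return {
        stated: (len(hits), sum(hits) / len(hits))
        for stated, hits in sorted(groups.items())
    }


# Hypothetical assessor: ten events, five assessed at 0.2 and five at 0.8.
estimates = [0.2, 0.2, 0.2, 0.2, 0.2, 0.8, 0.8, 0.8, 0.8, 0.8]
outcomes = [False, False, True, False, False, True, True, True, False, True]

table = calibration_table(estimates, outcomes)
```

In this made-up case one of the five events assessed at 0.2 occurred and four of the five assessed at 0.8 occurred, so the stated values match the relative frequencies exactly. Real data would need far more events per stated value before such a comparison meant much.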

Although logically prior to validity, the question of
reliability does not appear to have been treated in this literature.
There has been some worry about how to elicit probability
distributions from various assessors in a manner designed to reduce
between-assessor variability (Winkler, 1968). But so far as we know,
there has been no attempt to discover which assessment techniques
result in the most highly correlated estimates within a single
subject when he is in similar circumstances two or more times. This
is clearly important, since unreliable estimates will not
systematically correlate with any other reliable criterion and
may, therefore, lead to low validity.
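
One simple way to pose the reliability question is as a test-retest correlation computed separately for each assessment technique. The Python sketch below does this for a single hypothetical subject who assesses the same five events on two occasions under two techniques. The technique labels and all numbers are invented for illustration, and the product-moment correlation is only one of several indices that could be used.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical estimates for the same five events on two occasions.
sessions = {
    "technique A": ([0.1, 0.3, 0.5, 0.7, 0.9], [0.15, 0.3, 0.45, 0.75, 0.85]),
    "technique B": ([0.1, 0.3, 0.5, 0.7, 0.9], [0.5, 0.2, 0.6, 0.4, 0.8]),
}

reliability = {
    name: pearson_r(first, second) for name, (first, second) in sessions.items()
}
most_reliable = max(reliability, key=reliability.get)
```

Here technique A reproduces the subject's earlier estimates much more closely than technique B, so it would be preferred on grounds of reliability before any question of validity is raised.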

None of this is to claim that the theoretical approaches
discussed above should hold no interest for practitioners, nor that
theorists should ignore non-laboratory problems. Neither is the
case. A theory is relatively useless if it can not predict
behavior outside the experimental laboratory, and the practitioner
will be considerably aided by knowing the psychology of the decision
situation in which he is working.

^ $\prec$ is defined from $\precsim$ in the same way that < is defined from $\leq$.
That is, $x \prec y$ iff $x \precsim y$ and not $y \precsim x$ ("iff" is
read "if and only if").

^4 It is interesting to note that the concept of intervening
variable, first introduced in psychology in 1936 by Edward C.
Tolman, is similar in many respects to the concept of a measurable
variable discussed above. Tolman (1936) formalized this notion
so that mentalistic properties such as expectancy, valence, or
representation could be operationally defined and take their place
in rigorous, behavioristic psychological theory. Intervening
variables were operationally defined in terms of the relations
they predicted between independent variables and dependent
behavior.

For many years the concept of an intervening variable was
both influential and controversial in research on learning and
motivation (see Koch, 1959, and Chapter 5 of Marx, 1964, for a
glimpse of the later stages of debate). A definition that was
central in the debate, and appears to capture the essence of
intervening variables, was offered by MacCorquodale and Meehl
(1948) when they wrote:

"...First, the statement of such a concept does
not contain any words which are not reducible to
the empirical laws. Second, the validity of the
empirical laws is both necessary and sufficient
for the 'correctness' of the statements about
the concept. Third, the quantitative expression
of the concept can be obtained without mediate
inference by suitable groupings of terms in
the quantitative empirical laws" (MacCorquodale
and Meehl, 1948, p. 107).

Note the first two characteristics outlined by MacCorquodale
and Meehl, that the concept is defined only in terms of empirical
laws, and that their validity is necessary and sufficient for
the statements about the concept to be correct. This provides
an excellent description of the approach advocated in the present
paper.

The third characteristic, concerning quantification of the
concept, differs considerably from the present approach. A feature
of using measurement foundations as empirical laws is that the
laws themselves are qualitative, not quantitative, but taken
together they allow the concept to be expressed numerically.

^The usual set theoretic notation is used here and throughout
the rest of the paper:

$a \in A$ means "a is an element of A"
$A \cap B$ means "A intersect B"
$A \cup B$ means "A union B"
$A \subset X$ means "A is a subset of X"

The term "weak order" means that the ordering is connected,
i.e., either $A \precsim B$ or $B \precsim A$; reflexive, i.e., $A \precsim A$; and transitive,
defined earlier.

This formulation is different from Luce's (1967). However,
the simplicity gained for the present discussion is well worth
the subtle, although important, conceptual problems introduced.
The interested reader is referred to Luce (1967).

^8 DeGroot (1970, p. 77) suggests that the statistician
might imagine such an ideal device for the purpose of comparing
relative likelihoods of other events $A \in \mathcal{A}$.

^9 Manuscript in preparation entitled "A simultaneous evaluation
of a conjoint-measurement model for revision of opinion and a
strictly proper scoring rule."

^10 Manuscript in preparation entitled "The effects of a biased
payoff matrix on probabilistic information processing."